Abstract


The pose (position and orientation) of a polyhedral object can be determined with sparse range data obtained from simple light-stripe range finders. However, the sensing data inherently contains some error, which introduces uncertainty in the determination of the object's pose. This paper presents a method for estimating the uncertainty in determining the pose of an object when using several light-stripe range finders. Three-dimensional line segments obtained by the range finders are matched to model faces based on an interpretation tree search. The object pose is obtained by a least squares fit of the segment-face pairings. We show that the uncertainty in the position of the object can be estimated using the covariance matrix of the endpoint positions of the sensed line segments. Experiments with three light-stripe range finders show that our method makes it possible to estimate how accurately the pose of an object can be determined.
1 Introduction


Recognizing the pose of a three-dimensional (3-D) object in a workspace is a fundamental task in many computer vision applications, including automated assembly, inspection, and bin picking. Many object recognition algorithms have been developed. However, little attention has been given to estimating the uncertainty of object pose determinations. In this paper, we study the problem of estimating uncertainty in determining the pose of a polyhedral object when using multiple light-stripe range finders.

Simple  light-stripe  range  finders  are  among  the  fastest  and  least  expensive  ways  to  acquire 
accurate  range  data.  Multiple  range  finders  viewing  an  object  from  different  perspectives 
can  usually  provide  enough  constraints  to  determine  the  object’s  pose.  Imagine  that  a 
polyhedral  object  is  placed  at  an  arbitrary  pose  in  the  workspace  and  that  we  place  three 
simple  light-stripe  range  finders  above  the  workspace.  Based  on  an  interpretation  tree  search 
technique,  3-D  line  segments  obtained  by  the  range  finders  can  be  assigned  to  model  faces 
consistent  with  geometric  constraints.  Once  a  feasible  interpretation  is  found  that  satisfies 
the  geometric  constraints  for  all  line  segments,  the  transformation  from  the  model  coordinate 
frame  to  the  world  coordinate  frame  is  obtained  by  a  least  squares  method. 

As  a  result  of  sensing  error,  the  transformation  contains  inaccuracies.  Therefore,  we  need 
to estimate uncertainty in determining the pose of an object. Using an error analysis based
on  the  convergence  properties  of  the  least  squares  fit,  we  obtain  a  relationship  between  the 
covariance  matrix  of  the  line  segments’  endpoint  positions  and  the  covariance  matrix  of  the 
position  of  each  object  vertex.  The  pose  uncertainty  of  the  object  can  then  be  estimated 
from  this  relationship. 

Related  Work 

Our object recognition method is based on the use of simple light-stripe range finders. Though many 3-D object recognition systems using range image information have been reported [2], [5], [6], [7], [16] and some range imaging techniques are very fast [1], the recognition processes of these systems are still very slow, making such techniques impractical for industrial applications. Recognition is slow because these systems extract many surfaces and/or edges from raw, dense range images; this process is time-consuming and sometimes generates incorrect features, which cause difficulty when matching the features to object models. While a dense range image is appropriate to describe a complex scene precisely, scenes in industrial applications can usually be simplified by modifying the environment to enable object recognition using only simple sensors such as light-stripe range finders.

It has already been shown that light-stripe range finders are effective in determining the pose of polyhedral objects in controlled environments where some information about the object's pose is already known. Gordon and Seering [8] showed that object pose can be determined precisely with one simple light-stripe range finder provided that the a priori pose of the object is known approximately. Chen [3] proposed a pose determination method with three known correspondences between line segments and model faces.

Before we determine the pose of an object, we must first determine feature correspondences. To find correspondences between sensed features and model features, an interpretation tree search method with geometric constraints is used. Grimson and Lozano-Perez [9] demonstrated that local unary and binary geometric constraints are very effective in reducing the size of an interpretation tree. While Grimson and Lozano-Perez used the position and local surface orientation of a small set of points on the object, Murray and Cook [14] presented geometric constraints for sensed edges corresponding to model edges. However, since a light-stripe range finder provides the position and direction of a 3-D line segment that lies on an object face, different geometric constraints are required.

A least squares method is usually used to determine the pose of an object, that is, to obtain the rotation and translation components of a transformation [6], [12]. Grimson [10] suggested that uncertainty bounds on the object pose can be tightened by propagating initial errors algebraically through interpretation equations. Ellis [4] showed that the uncertainty bounds can be tightened by considering the cross-coupling between rotational and translational uncertainties. In [4] and [10], sensed surface normals were used to estimate the upper limit of the transformation error. A weighted least squares method for determining an error bound on the orientation of an object by considering the contribution to the error from each sensed vertex was shown in [18]. A criterion for choosing measurement points to minimize transformation error by using a sensitivity matrix was discussed in [17]. Since the pose uncertainty of an object can be represented by the covariance matrix of the position of each object vertex, we explore a pose uncertainty estimation method that uses the covariance matrix of the endpoint positions of sensed line segments.

In this section, we introduced the research objective and reviewed related work. In Section 2, an interpretation tree search technique with geometric constraints suitable for line segments is discussed. In Section 3, we focus on the error analysis for object pose determination and describe a pose uncertainty estimation technique. In Section 4, experiments with three light-stripe range finders show that our object recognition method successfully determines the pose of an object and that our pose uncertainty estimation method provides a useful tool for estimating how accurately the position and orientation of an object can be determined.

2 Fast Object Recognition with Three Light-Stripe Range Measurements


The task of model-based object recognition is to match sensed features to model features and to determine the object pose in a 3-D world coordinate frame. We begin with an example of recognizing an object. A simple light-stripe range finder projects a light plane onto the faces of an object and measures the 3-D line segments created by the light stripe, as shown in Figure 1. Three identical range finders are placed in the world coordinate frame as shown in Figure 2. We assume that the light source and viewpoint of each range finder are coincident. The range finders obtain 3-D line segments as shown in Figure 3. Our matching scheme, an interpretation tree search, assigns the sensed line segments to the corresponding model faces and uses geometric constraints to eliminate inconsistent segment-face pairings. The object's pose is successfully determined as shown in Figure 4. In this section, we describe our object recognition and pose determination technique.

Figure 1: A simple light-stripe range finder.

Figure 2: Sensor placement for object recognition. Sensors 0 and 1 are placed on the z axis, directed toward the origin. Their sensing planes, which are displayed as triangles, intersect perpendicularly. Sensor 2 is placed on the x axis and its sensing plane lies on the x-y plane.

Figure 3: Obtained 3-D line segments on object faces.

Figure 4: An object recognition result. The estimated rotations $\omega$ ($R_x$), $\varphi$ ($R_y$) and $\kappa$ ($R_z$) are given in degrees and the translations $t_x$, $t_y$ and $t_z$ are given in millimeters. The figure also reports the standard deviation of the distances between the endpoints of the line segments and the corresponding object faces, and the elapsed time in seconds (Sun SPARCstation IPC).

Figure 5: An interpretation tree for assignments between k sensed features and m model features.

2.1  Interpretation  Tree  Search  by  Geometric  Constraints 

Let $S_1, S_2, \ldots, S_k$ denote sensed line segments and $M_1, M_2, \ldots, M_m$ denote model faces. In general, there are $m^k$ ways of matching the line segments to the model faces, assuming that each line segment must match one model face. Though such assignments can be represented by an interpretation tree as shown in Figure 5, it is not feasible to explore the entire tree to find consistent interpretations. Rather, geometric constraints are used to discard inconsistent pairings while searching the tree depth-first with backtracking.

Grimson and Lozano-Perez [9] showed that the interpretation tree search technique with local unary and binary geometric constraints is a useful method to find a consistent set of pairings $(S_1, M_{p_1}), (S_2, M_{p_2}), \ldots, (S_k, M_{p_k})$, where $M_{p_i}$ is the model face which corresponds to line segment $S_i$. The unary constraints check the consistency of a pairing between a line segment and a model face, and the binary constraints check the consistency of two pairings. The specific constraints used in our method are given in Appendix A.
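The depth-first search with backtracking can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the function name `interpretation_tree_search` and the predicates `unary_ok` and `binary_ok` are hypothetical stand-ins for the constraints of Appendix A, which are not reproduced here.

```python
from typing import Callable, List, Sequence


def interpretation_tree_search(
    segments: Sequence[object],
    faces: Sequence[object],
    unary_ok: Callable[[object, object], bool],
    binary_ok: Callable[[object, object, object, object], bool],
) -> List[List[int]]:
    """Return every consistent assignment as a list of face indices,
    one index per sensed segment, in segment order."""
    results: List[List[int]] = []
    assignment: List[int] = []  # face index chosen for segments[0..depth-1]

    def extend(depth: int) -> None:
        if depth == len(segments):
            results.append(list(assignment))  # reached a leaf: all segments paired
            return
        seg = segments[depth]
        for face_index, face in enumerate(faces):
            # Unary constraint: is this single pairing plausible?
            if not unary_ok(seg, face):
                continue
            # Binary constraints: is it compatible with every earlier pairing?
            if all(
                binary_ok(segments[d], faces[assignment[d]], seg, face)
                for d in range(depth)
            ):
                assignment.append(face_index)
                extend(depth + 1)
                assignment.pop()  # backtrack and try the next face

    extend(0)
    return results
```

Because a branch is abandoned as soon as one constraint fails, the number of visited nodes is usually far below the $m^k$ leaves of the full tree.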

These unary and binary constraints are weaker than those in Grimson's work [11], which are based on face matching, since line segments carry less information than faces. Therefore, after applying the unary and binary constraints, we apply triplet constraints, which check a triplet of pairings between line segments and model faces, to prune the interpretation tree more efficiently. As deeper nodes are reached in the interpretation tree, more possible triplet pairings exist, making a triplet constraint check appear to be time-consuming [15]. To speed the process, we choose three line segments and three model faces under the condition that two of the line segments must intersect each other. Since the two line segments are therefore coplanar, two of the three model faces must be the same. The intersecting line segments can define the normal of the model face on which the line segments lie. The normal of the other model face can be obtained by solving a quadratic equation since the normal must be perpendicular to the direction vector of the third line segment. Further details of the triplet constraints may be found in Appendix B.
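One way to see why a quadratic arises is sketched below. The setup is an assumption made for illustration, since Appendix B is not reproduced in this section: the world-frame unit normal `n1` of the shared face is taken as known from the two intersecting segments, the cosine `cos_angle` of the angle between the two model face normals is taken from the model, and `w` is the unit direction of the third segment. The unknown normal must then satisfy n·n1 = cos_angle, n·w = 0 and |n| = 1, which leaves a quadratic with up to two roots. The name `candidate_normals` is hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def candidate_normals(n1, w, cos_angle, eps=1e-9):
    """Unit vectors n with n.n1 = cos_angle and n.w = 0 (n1 and w are unit vectors).

    Writing n = a*n1 + b*w + g*(n1 x w), the two linear conditions fix a and b,
    and the unit-length condition becomes a quadratic in g.
    """
    d = dot(n1, w)
    s = 1.0 - d * d  # squared length of n1 x w
    if s < eps:
        return []  # degenerate: third segment is parallel to the known normal
    a = cos_angle / s
    b = -a * d
    g_squared = (1.0 - cos_angle * cos_angle / s) / s
    if g_squared < -eps:
        return []  # no real root: the pairing is geometrically inconsistent
    g = math.sqrt(max(g_squared, 0.0))
    c = cross(n1, w)
    base = tuple(a * n1[i] + b * w[i] for i in range(3))
    if g == 0.0:
        return [base]
    return [tuple(base[i] + sign * g * c[i] for i in range(3)) for sign in (1.0, -1.0)]
```

An empty result means the triplet cannot be consistent, so the corresponding branch of the tree can be pruned.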

2.2  Ordering  Sensed  Features 

The  order  in  which  sensed  features  are  matched  is  very  important  since  early  rejection 
of  inconsistent  nodes  results  in  more  efficient  pruning  of  the  interpretation  tree.  In  our 
recognition  algorithm,  intersecting  line  segments  play  an  important  role  in  the  tree  search 
because  the  triplet  constraints  can  be  applied  to  such  line  segments  to  rapidly  eliminate 
segment-face  pairings  that  cannot  be  consistent.  Intersecting  line  segments  should  therefore 
be  used  as  early  as  possible  to  rapidly  prune  the  interpretation  tree  and  save  computation 
time. 
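The ordering rule can be sketched in a few lines. This is a hedged illustration only: `order_segments` is a hypothetical helper, and the list of intersecting index pairs is assumed to have been computed beforehand by a separate intersection test.

```python
def order_segments(num_segments, intersecting_pairs):
    """Return segment indices with intersecting segments first.

    intersecting_pairs is a list of (i, j) index pairs whose segments intersect;
    the remaining segments keep their original relative order.
    """
    first = []
    for i, j in intersecting_pairs:
        for index in (i, j):
            if index not in first:
                first.append(index)
    rest = [index for index in range(num_segments) if index not in first]
    return first + rest
```

With this ordering, the first nodes of the interpretation tree are exactly the ones to which the triplet constraints apply.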

2.3  Computing  Transformations 

2.3.1  Rotation  Component 

Intersecting  line  segments  are  not  only  used  in  the  triplet  constraints,  but  also  to  compute 
the  rotation  matrix  R  of  the  transformation  from  the  model  coordinate  frame  to  the  world 
coordinate  frame  (see  Appendix  B). 

If there are no intersecting line segments, a numerical polynomial-based technique is used to calculate the transformation after at least three consistent pairings between line segments and model faces are found. Let u, v and w denote the unit direction vectors of the three line segments and let a, b and c denote the unit normal vectors of the corresponding model faces. We can calculate a rotation matrix such that the rotated vectors x, y and z of the vectors a, b and c are orthogonal to u, v and w, respectively. Chen [3] has presented a similar polynomial approach to solve the same problem through a canonical configuration to reduce the number of unknowns to two. It is important to note that there are certain conditions that must be satisfied by the configuration of the surface normals and the direction vectors to solve this problem.

Unfortunately, these general polynomial-based methods are very sensitive to noise, as well as computationally expensive since an eighth-degree equation must be solved. On the other hand, our method, which uses intersecting line segments, is very fast and robust since a transformation is obtained by solving a quadratic equation in the triplet constraint check. Polynomial-based methods are therefore used only in the rare cases in which no intersecting line segments exist.


2.3.2  Translation  Component 

Next, we solve for the translation component $t$ of the transformation. A point $p$ in the world coordinate frame is related to a corresponding point $P$ in the model coordinate frame by

$$p = RP + t. \qquad (1)$$

Suppose that a line segment $S_i$, whose endpoints are $b_i$ and $e_i$, corresponds to a model face $M_{p_i}$. Any point $X = (X, Y, Z)^T$ on the model face satisfies the equation

$$N_{p_i}^T X + D_{p_i} = 0, \qquad (2)$$

where $N_{p_i}$ and $D_{p_i}$ are the unit normal and offset of the model face $M_{p_i}$, respectively. If the point $p$ is on the line segment $S_i$, the squared distance from the point to the corresponding model face is given by

$$(\Delta d_i)^2 = \left( N_{p_i}^T R^{-1}(p - t) + D_{p_i} \right)^2. \qquad (3)$$

The translation component $t$ is therefore obtained by minimizing the sum of the integral of the squared distance along each line segment over all pairings $(S_i, M_{p_i})$, $i = 1, \ldots, k$, of an obtained feasible interpretation:

$$E = \sum_{i=1}^{k} \int_{S_i} \left( N_{p_i}^T R^{-1}(p - t) + D_{p_i} \right)^2 ds_i, \qquad (4)$$

where $ds_i$ is an element of line segment $S_i$.

If the residual $E$ of fitting the model faces to the line segments is small enough, and if the endpoints of line segments $S_i$ for $i = 1, \ldots, k$ pass the additional test that they are near the model face $M_{p_i}$, then this interpretation is regarded as a globally consistent interpretation.
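With the rotation held fixed, the translation step reduces to a 3x3 linear system, which the following sketch illustrates. It relies on two facts: the distance to a face varies linearly along a segment, so its squared integral has a closed form, and setting the gradient with respect to $t$ to zero gives $\sum_i L_i n_i n_i^T t = \sum_i L_i n_i (n_i \cdot m_i + D_i)$, where $n_i = R N_{p_i}$ is the rotated face normal, $m_i$ the segment midpoint and $L_i$ the segment length. The names `fit_translation` and `solve3` are hypothetical, and the sketch is not the paper's implementation.

```python
import math


def solve3(m, rhs):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    a = [list(map(float, m[i])) + [float(rhs[i])] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("translation is not fully constrained by these faces")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_translation(pairings):
    """Least squares translation for a fixed rotation.

    Each pairing is (b, e, n, D): segment endpoints b and e in the world frame,
    the rotated unit face normal n = R N, and the model-face offset D.
    """
    m = [[0.0] * 3 for _ in range(3)]
    rhs = [0.0] * 3
    for b, e, n, D in pairings:
        length = math.dist(b, e)
        mid = [(bi + ei) / 2.0 for bi, ei in zip(b, e)]
        # Signed distance of the segment midpoint to the face when t = 0.
        c = sum(ni * mi for ni, mi in zip(n, mid)) + D
        for i in range(3):
            rhs[i] += length * n[i] * c
            for j in range(3):
                m[i][j] += length * n[i] * n[j]
    return solve3(m, rhs)
```

The `ValueError` branch corresponds to a configuration in which the face normals do not span three dimensions, so at least one translation direction is unconstrained.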


2.3.3  Refining  the  Transformation 

After an interpretation is deemed globally consistent, the rotation and translation components of the transformation are improved by another least squares process. Both initial rotation and translation values are used simultaneously to refine the fit of the sensed line segments to the model faces.


Table 1: Recognition results for 1000 trials.

| Conditions | Successful trials | Failed trials | Recognition time (sec) |
|---|---|---|---|
| Unary & binary constraints; no triplet constraints; no feature ordering | 895 | 105 | 10.1 |
| Unary & binary constraints; triplet constraints; no feature ordering | 949 | 51 | 0.7 |
| Unary & binary constraints; triplet constraints; feature ordering | 949 | 51 | 0.1 |

2.4  Simulation 

We run simulations to test the effectiveness of our object recognition method. We use a polyhedral object as shown in Figure 1. Three hypothetical light-stripe range finders are placed in the world coordinate frame as shown in Figure 2. The object is then randomly located in the world coordinate frame. A simulation proceeds as follows:

• As input data for the recognition program, a range finder simulator calculates the line segments which the three light-stripe range finders would get from viewing the object.

• Feasible interpretations are obtained by performing the interpretation tree search with the geometric constraints.

• Each feasible interpretation is verified by comparing object vertices found using the recognition algorithm with the correct values. If all estimated positions of the vertices are near enough to corresponding correct positions, the interpretation is regarded as correct.
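The verification in the last step can be sketched as a simple distance test. The function name `interpretation_is_correct` and the tolerance `tol` are assumptions for illustration; the threshold actually used in the simulation is not stated.

```python
import math


def interpretation_is_correct(estimated, truth, tol):
    """True when every estimated vertex lies within tol of its correct position.

    estimated and truth are equally long sequences of (x, y, z) vertex positions,
    listed in the same vertex order; tol is a hypothetical distance threshold.
    """
    if len(estimated) != len(truth):
        return False
    return all(math.dist(e, t) <= tol for e, t in zip(estimated, truth))
```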

The results of 1000 trials are shown in Table 1. All failed trials correspond to multiple interpretations which include some correct and some incorrect interpretations. Adding the triplet constraints reduces the average recognition time from 10.1 to 0.7 seconds and roughly halves the number of failed trials (from 105 to 51). The triplet constraints are therefore effective not only in pruning the interpretation tree, but also in improving recognition performance.

The ordering of line segments is also important. A typical example is shown in Figures 1 and 6. The intersecting line segments No. 5 and No. 12 in Figure 1 play a crucial role in decreasing the number of nodes of the interpretation tree. As a result of ordering the line segments so that the intersecting line segments are examined first, the computation time is decreased from 20 seconds to 0.3 seconds.

Figure 6: The number of nodes visited in the interpretation tree without feature ordering and with feature ordering. Ordering line segments for the tree search dramatically speeds pruning the interpretation tree.

One  problem  with  this  recognition  technique  is  that  it  takes  a  long  time  to  recognize  an 
object  if  there  are  no  intersecting  line  segments.  In  most  trials,  however,  intersecting  line 
segments  appear  on  object  faces,  which  is  a  characteristic  when  using  multiple  range  finders. 
As  a  result,  the  average  computation  time  for  object  recognition  is  about  0.1  second. 


3 Geometric Uncertainties in Pose Determination


Now we can determine the pose of an object. However, due to sensing error inherent in measuring line segments, the obtained transformation contains some error, which causes uncertainty in the position of the object. This section describes our technique for estimating the pose uncertainty.

3.1  Uncertainty 

The object pose itself is obtained by minimizing the sum of the squared distances between sensed line segments and corresponding object faces, and hence the transformation error is defined as a perturbation around the correct transformation with respect to the sensing error.

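Let the rotation component $R$ and translation component $t$ of the transformation be

$$R = R(\omega, \varphi, \kappa), \qquad t = (t_x, t_y, t_z)^T, \qquad (5)$$

where $\omega$, $\varphi$ and $\kappa$ are rotation angles around the x, y and z axes in the world coordinate frame. If $\mathbf{x} = (t_x, t_y, t_z, \omega, \varphi, \kappa)^T$ denotes the six transformation variables, the transformation error is defined by

$$\Delta\mathbf{x} = (\Delta t_x, \Delta t_y, \Delta t_z, \Delta\omega, \Delta\varphi, \Delta\kappa)^T.$$

In addition, we define the sensing error of the endpoints $(x_{2i-1}, y_{2i-1}, z_{2i-1})$, $(x_{2i}, y_{2i}, z_{2i})$ of line segment $S_i$ as

$$\Delta\mathbf{s} = (\Delta x_1, \Delta y_1, \Delta z_1, \ldots, \Delta x_{2k}, \Delta y_{2k}, \Delta z_{2k})^T.$$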

3.2 Relationship between Sensing Error and Transformation Error

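Our object recognition technique finds pairings $(S_i, M_{p_i})$ between line segments and model faces. Due to sensing error, a point $p$ on a line segment $S_i$ lies off the corresponding object face $M_{p_i}$ by a distance $\Delta d_i$ given by equation (3). As we mentioned in Section 2, the pose of the object is determined by minimizing the residual $E$ of equation (4) in terms of $\mathbf{x}$. The necessary condition for $E$ to reach an extremum is given as

$$\frac{\partial E}{\partial t_x} = \frac{\partial E}{\partial t_y} = \frac{\partial E}{\partial t_z} = \frac{\partial E}{\partial \omega} = \frac{\partial E}{\partial \varphi} = \frac{\partial E}{\partial \kappa} = 0. \qquad (6)$$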

Now, to examine the uncertainty in the transformation caused by sensing error, we linearize these non-linear equations around the approximate solution $(\mathbf{x}_0, \mathbf{s}_0)$, which corresponds to the correct transformation and endpoints:

$$A \Delta\mathbf{x} = -B \Delta\mathbf{s}, \qquad (7)$$

where $A$ is the Hessian matrix of $E$ with respect to $\mathbf{x}$ and $B$ is the Jacobian matrix of $\partial E / \partial \mathbf{x}$ with respect to $\mathbf{s}$.

Then we relate the object vertex position error to the transformation error $\Delta\mathbf{x}$. The position of a vertex $v_j$ in the world coordinate frame is related to a vertex $V_j$ in the model coordinate frame by

$$v_j = R V_j + t. \qquad (8)$$

The position error $\Delta v_j$ is then given by

$$\Delta v_j = D_j \Delta\mathbf{x}, \qquad (9)$$
where $D_j$ is the Jacobian matrix of $v_j$ with respect to $\mathbf{x}$. By substituting equation (7) into equation (9), the relationship between the position error and the sensing error becomes

$$\Delta v_j = -D_j A^{-1} B \Delta\mathbf{s}. \qquad (10)$$

The covariance matrix $C_{v_j}$ of the vertex $v_j$ is given by

$$C_{v_j} = E\{\Delta v_j \Delta v_j^T\} = D_j (A^{-1}B) C_s (A^{-1}B)^T D_j^T, \qquad (11)$$

where $C_s$ is the covariance matrix of the line segments' endpoint positions. The elements of the covariance matrix show how uncertain the vertex position is, and hence the $x$, $y$ and $z$ components of the position error of each vertex can be approximated as

$$\Delta v_{jx} \approx \sqrt{(C_{v_j})_{11}}, \quad \Delta v_{jy} \approx \sqrt{(C_{v_j})_{22}}, \quad \Delta v_{jz} \approx \sqrt{(C_{v_j})_{33}}. \qquad (12)$$
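The covariance propagation can be illustrated numerically with a deliberately simplified case. The sketch below evaluates $D_j (A^{-1}B) C_s (A^{-1}B)^T D_j^T$ for generic matrices and then applies it to a toy problem that is an assumption of this example, not a configuration from the paper: the rotation is held fixed (so $\mathbf{x}$ reduces to the translation and $D_j$ is the identity), and one measured point lies on each of three mutually orthogonal faces. For the residual $E = \sum_i (n_i \cdot (p_i - t))^2$ this gives $A = 2\sum_i n_i n_i^T$ and a $B$ made of the blocks $-2 n_i n_i^T$. All function names are hypothetical.

```python
def matmul(a, b):
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [
        list(map(float, a[i])) + [1.0 if i == j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: the pose is not fully constrained")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vertex_covariance(A, B, C_s, D_j):
    """Vertex covariance D_j (A^-1 B) C_s (A^-1 B)^T D_j^T; the sign of B cancels."""
    G = matmul(D_j, matmul(inverse(A), B))
    return matmul(matmul(G, C_s), transpose(G))


# Toy problem (assumed): translation only, one point on each of three orthogonal faces.
normals = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
A = [[2.0 * sum(n[i] * n[j] for n in normals) for j in range(3)] for i in range(3)]
B = [[-2.0 * n[i] * n[j] for n in normals for j in range(3)] for i in range(3)]
sigma = 0.3  # assumed endpoint noise in mm, equal and independent on every coordinate
C_s = [[sigma ** 2 if r == c else 0.0 for c in range(9)] for r in range(9)]
D_j = [[1.0 if r == c else 0.0 for c in range(3)] for r in range(3)]

C_v = vertex_covariance(A, B, C_s, D_j)
errors = [C_v[i][i] ** 0.5 for i in range(3)]  # per-axis position error
```

In this toy case each translation component is determined by exactly one measured coordinate, so the vertex covariance equals the sensing covariance. When the normals do not span three dimensions, `inverse` raises an error, which mirrors the unconstrained degree of freedom discussed in Section 3.3.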


3.3  Examples 

The following are some examples of estimating the uncertainty in pose determination. Given the shape of an object, a transformation for the object and a placement of three light-stripe range finders, a range finder simulator calculates line segments which would appear on the object. We assume that all endpoints of obtained line segments have the same error (zero-mean Gaussian white noise $N(0, 1)$) and that any two endpoints are independently measured, so that their respective errors are uncorrelated (though the mechanism of the sensing error of a range finder is complex in practice [13]). Thus, the covariance matrix $C_s$ of the endpoint positions of the line segments becomes the identity matrix. We can estimate the uncertainty of each vertex of the object with equation (11).

Given a model as shown in Figure 1, a sensor placement as in Figure 2, and the same transformation as in Figure 4, the estimated uncertainty of each vertex of the object is shown in Figure 7. In this figure, the lengths of the three bars on each vertex along the x, y and z directions are given by equation (12), and show how uncertain the position of each vertex is.¹ The position error depends on the pose of the object with respect to the range finders, that is, the spatial distribution of line segments on the object faces. Another example with a different transformation is shown in Figure 8. Figure 9 shows the object in the same pose as Figure 8, but with a different sensor placement. The position error in Figure 9 is much larger than that in Figure 8 as a result of the line segment distribution on the object faces.

¹ For display purposes, those lengths equal $12\Delta v_{jx}$, $12\Delta v_{jy}$ and $12\Delta v_{jz}$, respectively.

In general, as the number of different faces on which line segments fall increases, pose determination accuracy also increases. Note that position error cannot be estimated by the method described here when the surface normals of all the object faces on which line segments lie are coplanar. In this case, the translation component cannot be estimated by the pose determination technique because there is an unconstrained degree of freedom.

Figure 7: An uncertainty estimation result after recognizing the object. Three bars on each vertex show the uncertainty in pose determination. The value given in mm is the average position error of all vertices.


4  Experiments 


This section presents experimental results in recognizing an object and estimating pose uncertainty. First, the procedure of our experiments, which includes image processing for extracting 2-D line segments and computing the positions of 3-D line segments in the world coordinate frame, is described. Then, experimental results are shown and compared with simulation results.

4.1  Procedure  for  Experiments 

Each light-stripe range finder is composed of a TV camera with a 16 mm lens and a laser diode projector whose wavelength is 670 nm. The laser beam is spread by a cylindrical lens to generate a light plane. The baseline length between the TV camera and the laser projector is about 100 mm. We place three identical range finders above the workspace as shown in Figure 10. The distance between each range finder and the workspace center is about 360 mm and each range finder's absolute accuracy of measuring 3-D coordinates is ±0.5 mm within the workspace.

Figure 8: An uncertainty estimation result with the same sensor placement as in Figure 7 but a different object pose.

Figure 9: An uncertainty estimation result with the same object pose as in Figure 8 but a different sensor placement.

Figure 10: Sensor placement for experiments.

For each range finder, line segments are extracted by the following procedure:

• Take an image with the laser diode projector off.

• Take an image with the laser diode projector on.

• Compute image differences and detect edges.

• Track those edges and find the endpoints of the 2-D line segments.

• Compute the positions of 3-D line segments using projective transformation coefficients (coefficients are calculated during calibration).

Once all the 3-D line segments have been found, apply the object recognition technique. Finally, estimate the uncertainty of each calculated vertex position.
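The image differencing step can be sketched as follows. This is a simplified stand-in rather than the edge detector used in the experiments: instead of detecting and tracking edges, it keeps, for each image row, the column where the laser-on image is brightest relative to the laser-off image. The function name `stripe_pixels`, the nested-list image representation and the `threshold` value are assumptions for illustration.

```python
def stripe_pixels(img_off, img_on, threshold):
    """Locate the light stripe by image differencing.

    img_off and img_on are equally sized grayscale images given as lists of rows.
    Returns (row, col) for the strongest difference in each row, provided that
    difference reaches the threshold; rows without a clear stripe are skipped.
    """
    points = []
    for r, (row_off, row_on) in enumerate(zip(img_off, img_on)):
        diff = [on - off for on, off in zip(row_on, row_off)]
        if not diff:
            continue
        c = max(range(len(diff)), key=diff.__getitem__)
        if diff[c] >= threshold:
            points.append((r, c))
    return points
```

Subtracting the laser-off image removes ambient illumination, so a single threshold can separate the stripe from the background before 2-D line segments are fitted to the resulting pixels.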

4.2  Experimental  Results 

An object like the one depicted in Figure 1 is placed at an arbitrary pose in the workspace. Each range finder takes two images (one with the laser diode on, one with the diode off) and detects edges as shown in Figure 11. Figure 12 shows obtained 3-D line segments and object recognition and position error estimation results. For comparison, Figure 13 shows a simulation result with the same object pose under the same sensor placement as the experiment shown in Figure 12. The recognition time is 0.67 sec in the experiment but only 0.05 sec in the simulation. In the experiment, the geometric constraints used in the interpretation tree search were weakened to allow for error in the measurement, thus increasing the number of visited nodes. Note that the line segments No. 0 and No. 1 and the line segments No. 6 and No. 7 in Figure 12 are not connected. Edge tracking often fails to detect a correct junction of two line segments on a concave object edge as a result of interreflection of the light plane. Nevertheless, recognition succeeded because our matching technique uses assignments of line segments to model faces instead of relying on exact matching of line segment endpoints to model edges.

Figure 11: Input images for the three light-stripe range finders and extracted edge images. The object is placed on a support cube whose size is 60 x 60 x 60 mm. The cube is not regarded as a part of the object.

4.3  Absolute  Accuracy 

We estimated the absolute accuracy in pose determination with the sensor placement shown in Figure 10. The object is located with a known transformation (Cases 1 to 6), and the object pose is estimated 10 times for each transformation. The mean and standard deviation of position errors (equation (12)) of each vertex are calculated. Table 2 shows the averages of the means and standard deviations of the position errors for all vertices. For all cases, the standard deviations of vertex position errors are within 0.6 mm. These values are consistent with the simulation results for the same transformations.^

Figure 12: Experimental 3-D line segments and object recognition and position error estimation results for an arbitrary pose.

Figure 13: Simulated 3-D line segments and object recognition and position error estimation results for the object pose shown in Figure 12.

Table 2: Absolute accuracy in pose determination. The object pose is estimated for the object with a known transformation. In all cases, $\omega = 0^\circ$, $\phi = 0^\circ$, and $t_z$ = -6.7.5 mm.

Transformation                                                      $\Delta v_x$ (mm)  $\Delta v_y$ (mm)  $\Delta v_z$ (mm)

Case 1  $t_x$ = 5 mm, $t_y$ = -5 mm, $\kappa = 0^\circ$     Mean         -0.46              -0.10              -0.04
                                                            Std           0.52               0.30               0.15

Case 2  $t_x$ = 5 mm, $t_y$ = 0 mm, $\kappa = 0^\circ$      Mean         -0.35              -0.34               0.06
                                                            Std           0.49               0.38               0.13

Case 3  $t_x$ = 5 mm, $t_y$ = 0 mm, $\kappa = 30^\circ$     Mean         -0.31              -0.39               0.24
                                                            Std           0.20               0.14               0.07

Case 4  $t_x$ = 0 mm, $t_y$ = -5 mm, $\kappa = 30^\circ$    Mean          0.66               0.22               0.34
                                                            Std           0.32               0.18               0.09

Case 5  $t_x$ = 10 mm, $t_y$ = 0 mm, $\kappa = 60^\circ$    Mean         -0.11               0.04               0.11
                                                            Std           0.16               0.13               0.15

Case 6  $t_x$ = 10 mm, $t_y$ = -5 mm, $\kappa = 60^\circ$   Mean         -0.32               0.06               0.12
                                                            Std           0.27               0.24               0.16
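The statistics summarized in Table 2 can be illustrated with a minimal sketch. It assumes that the position error of a vertex is simply the estimated coordinate minus the known coordinate along each axis (the exact definition is equation (12)), and that the sample standard deviation is used; the function name `vertex_error_stats` and the data layout are hypothetical.

```python
from statistics import mean, stdev


def vertex_error_stats(true_vertices, estimated_runs):
    """Average the per-vertex mean and std of position errors, axis by axis.

    true_vertices: list of known (x, y, z) vertex positions in mm.
    estimated_runs: list of runs; each run lists the estimated (x, y, z)
        positions in the same vertex order (at least two runs are needed).
    Returns (avg_mean, avg_std), each a list of three values for x, y, z.
    """
    avg_mean, avg_std = [], []
    for axis in range(3):
        means, stds = [], []
        for v, true_pos in enumerate(true_vertices):
            # Error of vertex v along this axis in every run
            # (assumed definition: estimated minus true).
            errors = [run[v][axis] - true_pos[axis] for run in estimated_runs]
            means.append(mean(errors))
            stds.append(stdev(errors))  # sample standard deviation (assumption)
        # Average over all vertices, as reported per case in the table.
        avg_mean.append(mean(means))
        avg_std.append(mean(stds))
    return avg_mean, avg_std
```

With 10 estimates per transformation, `estimated_runs` would hold 10 runs, and one call per case would yield one Mean row and one Std row.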


4.4  Relative  Accuracy 

The relative accuracy in pose determination was estimated as follows.

1.  The  object  is  placed  at  an  arbitrary  pose  in  the  workspace. 

2.  The  object  pose  is  estimated  initially. 

3.  The  object  is  moved  in  the  x  direction  by  5mm. 

4.  The  object  pose  is  estimated  again  and  compared  with  the  initial  pose. 

5.  Steps  3  and  4  are  repeated. 

Figure 14 shows the experimental results. The estimated x component of the translation changes linearly by 5 mm and the y component is almost constant. The difference between the actual and estimated translation components is within ±0.25 mm. Similar experiments for 10 different initial object poses resulted in almost the same relative accuracy.

^In the simulation the standard deviations of vertex position errors are about 0.$ mm, assuming the measurement error of the range finder to be σ = 0.3 mm.

Figure 14: Relative accuracy in pose determination.
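The comparison in steps 3 to 5 can be sketched as follows. This is only an illustration: it assumes the estimated translation component along the direction of motion has been recorded after every 5 mm step, and the names `relative_errors` and `within_tolerance` are hypothetical.

```python
def relative_errors(estimated, step_mm=5.0):
    """Difference between estimated and commanded displacement at each step.

    estimated: translation component along the direction of motion,
        estimated[0] being the initial pose and estimated[k] the pose
        after k steps.
    """
    initial = estimated[0]
    return [(value - initial) - k * step_mm for k, value in enumerate(estimated)]


def within_tolerance(errors, tol_mm=0.25):
    """True if every relative error lies inside +/- tol_mm."""
    return all(abs(e) <= tol_mm for e in errors)
```

The default tolerance mirrors the ±0.25 mm bound quoted above; any other bound can be passed explicitly.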

5  Conclusion 

We have presented a method for estimating uncertainty in determining the pose of a polyhedral object when using multiple light-stripe range finders.

An  object  recognition  method  based  on  an  interpretation  tree  search  has  been  used  to 
determine  the  object  pose.  In  this  method,  3-D  line  segments  obtained  by  the  range  finders 
are  consistently  matched  to  model  faces  based  on  geometric  constraints.  We  have  introduced 
triplet  constraints  to  dramatically  speed  pruning  of  the  interpretation  tree. 

We have determined the relationship between uncertainty in object pose determination and sensing error. The pose error of an object can be estimated from the covariance matrix of the endpoint positions of sensed line segments.

Experiments  with  simple  light-stripe  range  finders  show  that  our  method  makes  it  possible 
to  estimate  how  accurately  the  pose  of  an  object  can  be  determined. 
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Appendix 


A  Geometric  Constraints 

This  appendix  gives  unary  and  binary  constraints  used  in  our  object  recognition  method. 

•  Length  constraint  (Unary  constraint) 

If  a  line  segment  lies  on  a  model  face,  the  length  of  the  segment  must  be  less  than  or 
equal  to  the  maximum  distance  between  two  vertices  of  the  model  face. 

•  Distance  constraint  (Binary  constraint) 

If  two  line  segments  lie  on  two  different  model  faces,  the  range  of  distances  between 
the  two  line  segments  must  be  within  the  range  of  distances  between  the  corresponding 
two  model  faces. 

•  Adjacency  constraint  (Binary  constraint) 

Two adjacent line segments from the same range finder must be assigned to two adjacent model faces.*

•  Intersection  constraint  (Binary  constraint) 

If  two  line  segments  which  come  from  different  range  finders  intersect  each  other,  these 
line  segments  must  be  assigned  to  the  same  model  face. 

When a new node at the $i$th level is reached in the interpretation tree, a new pairing $(S_i, M_{p_i})$ is generated, which must be subjected to the unary constraints. Also, the $i - 1$ newly induced pairs of pairings, $[(S_i, M_{p_i}), (S_j, M_{p_j})]$ for $j = 1, \ldots, i - 1$, must be subjected to the binary constraints.
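How a new pairing is screened at a tree node can be sketched as below. The sketch implements only the length constraint concretely and treats the remaining constraints as caller-supplied predicates. The representation of a pairing as a `(segment, face_vertices)` tuple, the tolerance `tol`, and all function names are assumptions made for illustration.

```python
import math
from itertools import combinations


def max_vertex_distance(face_vertices):
    """Largest distance between any two vertices of a model face."""
    return max(math.dist(a, b) for a, b in combinations(face_vertices, 2))


def satisfies_length(segment, face_vertices, tol=0.0):
    """Unary length constraint: the segment may not exceed the face's widest span.

    tol is a hypothetical allowance for measurement noise.
    """
    start, end = segment
    return math.dist(start, end) <= max_vertex_distance(face_vertices) + tol


def consistent(new_pair, earlier_pairs, unary_checks, binary_checks):
    """Screen the pairing created at level i of the interpretation tree.

    new_pair: (segment, face_vertices) generated at the new node.
    earlier_pairs: the i - 1 pairings on the path from the root.
    unary_checks: predicates taking one pairing.
    binary_checks: predicates taking the new pairing and an earlier one.
    """
    if not all(check(new_pair) for check in unary_checks):
        return False
    return all(
        check(new_pair, old) for old in earlier_pairs for check in binary_checks
    )
```

A node would be pruned as soon as `consistent` returns `False`, so the subtree below it is never visited.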


*Two model faces which share a vertex are regarded as adjacent.


B Triplet Constraints

In this appendix, we present two triplet constraints used in our object recognition method.


B.1 Surface Normal Constraint

Figure 15: Surface normal constraint. For the constraint to be satisfied, the transformed surface normal $R\mathbf{m}_3$ should be on the plane $\pi_3$ whose normal is $\mathbf{s}_3$ and also on the conical surface defined by the surface normal $\mathbf{n}_{12}$ and the angle $\varphi_{13}$.

Intersecting line segments can define the normal of the face on which the line segments lie. In Figure 15, let $\mathbf{s}_1$ and $\mathbf{s}_2$ denote the unit direction vectors of intersecting line segments $S_1$ and $S_2$ respectively. The unit normal of a plane $\pi_{12}$, on which those line segments lie, is represented by

$$
\mathbf{n}_{12} = \frac{\mathbf{s}_1 \times \mathbf{s}_2}{\|\mathbf{s}_1 \times \mathbf{s}_2\|}
\tag{13}
$$


Let $\mathbf{m}_1$ be the unit normal of a model face, $M_1$, which is assigned to the two line segments, and let $R$ denote the rotation component of the transformation from the model coordinate frame to the world coordinate frame. The unit normal of model face $M_1$ in the world coordinate frame, which is given by $R\mathbf{m}_1$, is set equal to either the unit normal $\mathbf{n}_{12}$ of the plane $\pi_{12}$ or $-\mathbf{n}_{12}$. The direction is chosen such that the normal of the plane $\pi_{12}$ is directed toward the range finders from which the line segments $S_1$ and $S_2$ were obtained.
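Equation (13) and the choice of sign can be sketched as follows. The sketch assumes that a single sensor position and one point on the plane are available to decide which side faces the range finder; the function name `oriented_plane_normal` and its parameters are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def oriented_plane_normal(s1, s2, point_on_plane, sensor_position):
    """Unit normal of the plane spanned by s1 and s2, flipped to face the sensor.

    s1, s2: direction vectors of the two intersecting line segments.
    point_on_plane: any point of the plane, e.g. a segment endpoint.
    sensor_position: assumed location of the range finder.
    """
    n = cross(s1, s2)
    length = math.sqrt(dot(n, n))
    if length < 1e-9:
        raise ValueError("segments are parallel; the plane normal is undefined")
    n = tuple(c / length for c in n)
    to_sensor = tuple(s - p for s, p in zip(sensor_position, point_on_plane))
    # Keep n if it points toward the sensor, otherwise use -n.
    return n if dot(n, to_sensor) >= 0 else tuple(-c for c in n)
```

When the two segments come from different range finders, some rule for combining the two viewing directions would be needed; that case is not covered here.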

Let $S_3$ denote another line segment which does not lie on the plane $\pi_{12}$. A possible model face $M_3$, matched to the line segment $S_3$, must satisfy the following conditions:


•  The angle between the two model faces is invariant under a rigid transformation, that is, $\angle(R\mathbf{m}_1, R\mathbf{m}_3) = \angle(\mathbf{m}_1, \mathbf{m}_3) = \varphi_{13}$.


•  The direction vector of the third line segment is perpendicular to the normal of the assigned model face, that is, $\mathbf{s}_3 \perp R\mathbf{m}_3$.


Consequently, the unit normal $R\mathbf{m}_3$ of the transformed model face $M_3$ can be obtained by solving the following equations simultaneously:

$$
\begin{aligned}
m_{1x} m_{3x} + m_{1y} m_{3y} + m_{1z} m_{3z} &= \cos\varphi_{13} \\
s_{3x} m_{3x} + s_{3y} m_{3y} + s_{3z} m_{3z} &= 0 \\
m_{3x}^2 + m_{3y}^2 + m_{3z}^2 &= 1
\end{aligned}
\tag{14}
$$

where

$$
R\mathbf{m}_1 = \begin{pmatrix} m_{1x} \\ m_{1y} \\ m_{1z} \end{pmatrix}, \quad
R\mathbf{m}_3 = \begin{pmatrix} m_{3x} \\ m_{3y} \\ m_{3z} \end{pmatrix}, \quad
\mathbf{s}_3 = \begin{pmatrix} s_{3x} \\ s_{3y} \\ s_{3z} \end{pmatrix}.
$$

If no real root exists for these equations, the chosen triplet $[(S_1, M_1), (S_2, M_1), (S_3, M_3)]$ is inconsistent, and this interpretation is discarded. Since the surface normals $R\mathbf{m}_1$ and $R\mathbf{m}_3$ in Figure 15 correspond to two unit surface normals, $\mathbf{m}_1$ and $\mathbf{m}_3$, in the model coordinate frame, the rotation matrix $R$ can be computed [9].

Figure 16: Projection constraint. The projected points of the endpoints of line segments $S_1$, $S_2$ and $S_3$ must be within the ranges $l_1$ (for $S_1$ and $S_2$) and $l_3$ (for $S_3$) respectively.
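One possible way to solve equations (14) in closed form is sketched below; it is not necessarily the procedure used by the authors. The unknown normal is written as a combination of the first transformed normal, the segment direction, and their cross product, and the unit-length condition then gives a quadratic with zero or two real roots. The function name `solve_third_normal` and the tolerance `eps` are assumptions.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def solve_third_normal(n1, s3, cos_phi, eps=1e-9):
    """Return every unit vector x with n1.x = cos_phi and s3.x = 0.

    n1: unit normal of the first face in world coordinates (R m1).
    s3: unit direction vector of the third line segment.
    cos_phi: cosine of the model angle between the two faces.
    An empty list means no real root exists, so the triplet is inconsistent.
    """
    d = dot(n1, s3)
    denom = 1.0 - d * d  # squared length of n1 x s3
    if denom < eps:
        raise ValueError("s3 is parallel to n1; the system is degenerate")
    # Write x = alpha*n1 + beta*s3 + gamma*(n1 x s3); the two linear
    # equations fix alpha and beta, the unit-length equation fixes gamma^2.
    alpha = cos_phi / denom
    beta = -cos_phi * d / denom
    gamma_sq = (denom - cos_phi * cos_phi) / (denom * denom)
    if gamma_sq < -eps:
        return []
    gamma = math.sqrt(max(gamma_sq, 0.0))
    w = cross(n1, s3)
    base = tuple(alpha * a + beta * b for a, b in zip(n1, s3))
    plus = tuple(p + gamma * c for p, c in zip(base, w))
    minus = tuple(p - gamma * c for p, c in zip(base, w))
    return [plus] if gamma == 0.0 else [plus, minus]
```

Each returned candidate for the third normal would still have to pass the remaining checks, since the equations alone generally admit two solutions.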

B.2  Projection  Constraint 

A triplet surviving the surface normal constraint is subjected to another triplet constraint. Suppose that the surface normal of a plane $\pi_{13}$ is defined by the vector product of the transformed normals $R\mathbf{m}_1$ and $R\mathbf{m}_3$ of two model faces $M_1$ and $M_3$ in Figure 16.† Let $P$ denote the intersection point of the line $l_{13}$ (the intersection line of the two transformed model faces) with the plane $\pi_{13}$. When any point on the transformed model face $M_1$ is projected onto the plane $\pi_{13}$ along the direction of the line $l_{13}$, the projected point will be within the range denoted by $l_1$ on the plane $\pi_{13}$. Similarly, the projection of any point on the transformed model face $M_3$ will fall on the range $l_3$. Since the three line segments $S_1$, $S_2$ and $S_3$ are on the transformed model faces $M_1$, $M_1$ and $M_3$ respectively, the projected points of the endpoints of the line segments must fall within the corresponding ranges. If one or more endpoints are out of the ranges, the triplet $[(S_1, M_1), (S_2, M_1), (S_3, M_3)]$ is inconsistent.

†The two model faces $M_1$ and $M_3$ are not necessarily adjacent.
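The range test can be sketched as below, under stated assumptions. Projecting a point along the intersection line removes its component in that direction; for a point on a face, what remains lies along the cross product of the line direction and the face normal, so a single signed coordinate measured from P describes it. The function names, the sign convention of that coordinate, and the tolerance `tol` are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def projected_coordinate(point, origin, line_dir, face_normal):
    """Signed position of the projected point, measured from origin (P).

    line_dir: unit direction of the intersection line of the two faces.
    face_normal: unit normal of the face, perpendicular to line_dir.
    """
    in_plane_axis = cross(line_dir, face_normal)  # perpendicular to line_dir
    rel = tuple(p - o for p, o in zip(point, origin))
    return dot(rel, in_plane_axis)


def face_range(face_vertices, origin, line_dir, face_normal):
    """Interval covered by the projection of a (convex) face's vertices."""
    coords = [
        projected_coordinate(v, origin, line_dir, face_normal)
        for v in face_vertices
    ]
    return min(coords), max(coords)


def endpoints_in_range(segment, origin, line_dir, face_normal, allowed, tol=0.0):
    """True if both projected endpoints fall inside the allowed interval."""
    low, high = allowed
    return all(
        low - tol
        <= projected_coordinate(p, origin, line_dir, face_normal)
        <= high + tol
        for p in segment
    )
```

Because the coordinate is measured relative to P and the axes rotate with the faces, the interval can be computed once from the model and compared directly with the projected endpoints of the sensed segments.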
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